1. Introduction

The main purpose of routing and flow control in a communication network
is, roughly speaking, to keep delay per message within an acceptable
level while minimizing the amount of offered traffic that is rejected by
the network due to its inability to handle it. These two objectives are
clearly contradictory, so a good routing and flow control scheme must
strike a balance between the two. It should also take into account a
number of other issues such as fairness for all users, the possibility
that the network topology can be altered due to unexpected link or node
failures, and the fact that the statistics of offered traffic change
with time.

In these notes we consider some aspects of routing and flow control for
long-haul wire data networks in which the communication resource is scarce
(as opposed to local networks such as Ethernet where it is not), and where
there are no issues of contention resolution due to random access of a
broadcast medium (as in some satellite, local, and packet radio networks).

We place primary emphasis on optimal procedures since these offer a more
sound philosophical basis than heuristic schemes and also provide a
yardstick for measuring the effectiveness of other methods.


We  consider  primarily  the  subject  of  routing  in  a  quasistatic  offered 
load  environment.  By  routing  we  mean  the  set  of  decisions  regarding  the 
outgoing  links  to  be  used  for  routing  data  at  each  subnetwork  node.  By 
quasistatic  environment  (see  [1])  we  mean  a  situation  where  the  offered 
traffic  statistics  for  each  origin-destination  pair  change  slowly  over 
time  and  furthermore  individual  offered  traffic  sample  functions  do  not 
exhibit  frequently  large  and  persistent  deviations  from  their  averages. 

A typical quasistatic network is one accommodating a large number of
interactive users for each origin-destination pair and in which the law
of large numbers approximately takes hold. In such an environment it is
valid to base routing decisions on average levels of traffic input flows
which can be estimated from past history measurements.

Situations where the quasistatic assumption is not valid are
typically characterized by the presence of few and large users that can by
themselves overload the network over brief periods of time if left
uncontrolled. We then talk of a need to provide dynamic routing. By this we
mean short term adjustment of routes to adapt to the instantaneous
network state, which includes instantaneous traffic input rates as well as
queue lengths. Dynamic routing is a subject that is insufficiently well
understood at present and should probably be studied in combination with
flow control. We will not consider it further in these notes.

While there are situations where routing can be considered in
isolation from flow control, in other cases the interactions between
routing and flow control are so strong that they cannot be ignored in a
meaningful analysis. By flow control we mean the set of decisions
regarding the amount of traffic to be admitted in the network for each
origin-destination pair or each user pair conversation. It is intuitively
clear that if data is routed efficiently within the network then more
traffic can be admitted into the network without violating the users'
dissatisfaction threshold. Therefore incremental changes in routing can be
expected to have an effect on the amount of traffic that flow control
should allow to enter the network. On the other hand routing changes should
take into account the concurrent effects of flow control if they are to be
effective. The resulting coupling between routing and flow control can be
quite complex and only recently has there been substantial progress towards
understanding it. Some of the most important work in this area [10], [11]
is described in the last section where a combined routing and flow control
optimization problem is formulated. It turns out that this problem is
essentially the same as the optimal routing problem and can be solved by
simple adaptation of the type of algorithms described in these notes.
Another related subject of considerable current interest is routing and
flow control of real-time data, that is, data that becomes useless if not
delivered within a specified time delay. Digitized voice is a prime
example of such data. We refer to [12] for related work.

2. Routing Variables in Quasistatic Routing

Traffic congestion in a quasistatic data network can be reasonably
well evaluated in terms of the arrival rates at each of the transmission
queues. There is one such queue per directed link in the network and its
arrival rate is referred to as the total flow of the link. For a link
(i,k) we use the symbol $F_{ik}$ to denote the corresponding total flow.
This flow is measured in data units per sec where the data units can be
bits, packets, messages, etc. Sometimes it is meaningful to measure flow in
units that are assumed to be directly proportional to data units per sec
such as virtual circuit calls traversing the link.

Congestion is typically measured in terms of some function of the
total flows $F_{ik}$. For example

    $$\sum_{(i,k)} D_{ik}(F_{ik}) \tag{1}$$

is a frequently used measure of congestion where $D_{ik}(F_{ik})$ represents
(an approximation to) the average number of messages in queue or under
transmission at link (i,k) when the flow is $F_{ik}$. A frequently used
formula is

    $$D_{ik}(F_{ik}) = \frac{F_{ik}}{C_{ik} - F_{ik}} \tag{2}$$


where $C_{ik}$ is the transmission capacity of the (i,k) transmission line
measured in the same units as $F_{ik}$. This is based on the hypothesis that
each queue behaves as an M/M/1 queue and is referred to as the Kleinrock
independence assumption. While this hypothesis is almost never true in
practice, the expression (1), (2) represents a useful measure of performance
since it expresses qualitatively the fact that congestion sets in when a
flow $F_{ik}$ approaches the corresponding link capacity $C_{ik}$. Another
useful measure of congestion is given by

    $$\max_{(i,k)} \left\{ \frac{F_{ik}}{C_{ik}} \right\} \tag{3}$$

i.e. maximum link utilization. A computational study [3] has shown that
it typically makes little difference whether the objective function (1)-(2)
or (3) is used for optimizing routing. This is particularly true for
heavily loaded networks where computational results show that optimal
routing with respect to one objective function [(1)-(2) or (3)] is within
very few (1-3) percentage points of being optimal with respect to the
other.
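
The two congestion measures can be evaluated directly from a table of link
flows and capacities. The following is a minimal sketch, assuming flows and
capacities are stored in dictionaries keyed by link (i,k); the function names
and the sample numbers are illustrative and do not come from the notes.

```python
def mm1_occupancy(flow, capacity):
    """Formula (2): F/(C - F), valid only for 0 <= F < C."""
    if not 0 <= flow < capacity:
        raise ValueError("flow must lie in [0, capacity)")
    return flow / (capacity - flow)


def total_congestion(flows, capacities):
    """Measure (1) with each D_ik given by formula (2)."""
    return sum(mm1_occupancy(flows[l], capacities[l]) for l in flows)


def max_utilization(flows, capacities):
    """Measure (3): the largest ratio F_ik / C_ik over all links."""
    return max(flows[l] / capacities[l] for l in flows)


# Hypothetical two-link example (same units for flows and capacities).
flows = {(1, 2): 2.0, (2, 3): 6.0}
capacities = {(1, 2): 4.0, (2, 3): 8.0}

print(total_congestion(flows, capacities))  # 2/(4-2) + 6/(8-6)
print(max_utilization(flows, capacities))   # max(2/4, 6/8)
```

Note how measure (1)-(2) grows without bound as any single flow approaches
its capacity, whereas measure (3) only approaches one.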

It is useful to break down the total link flows into the portions
that have common destination. Thus for a link (i,k) we denote by
$f_{ik}(j)$ the flow in the transmission queue (i,k) of data units destined
for node j. Clearly we have

    $$F_{ik} = \sum_{j} f_{ik}(j) \tag{4}$$

Furthermore conservation of flow holds at each node in the form

    $$\sum_{(m,i)} f_{mi}(j) + r_i(j) = \sum_{(i,k)} f_{ik}(j), \quad \forall\ i, j \text{ with } i \neq j \tag{5}$$

The right side of (5) represents the total outgoing flow from node i that
is destined for j, while the left side represents the total incoming flow
into i from other nodes that is destined for j, plus the term $r_i(j)$ which
represents flow entering the network at node i and destined for j. By
adding (5) for all destinations j we see that there is also conservation
of total flows at each node i.

The objective of the routing algorithms that we will consider can
be loosely stated as follows:

Given a set of external traffic inputs $\{r_i(j)\}$ [cf. (5)] find a
"desirable" corresponding set of total flows $\{F_{ik}\}$.

Let us leave aside for the moment the question of how we measure
"desirability" of the set of total flows $\{F_{ik}\}$ [i.e. which objective
function such as (1) or (3) we use], and concentrate on the instruments
(or controls) at our disposal for influencing the values of $\{F_{ik}\}$.
These are called the routing variables. We first discuss link routing
variables and then consider path routing variables.

The routing variable of link (i,k) with respect to destination j is
defined by

    $$\phi_{ik}(j) = \frac{f_{ik}(j)}{\sum_{m} f_{im}(j)} \tag{6}$$

for nodes i such that $\sum_m f_{im}(j) > 0$, and represents the fraction of
flow arriving at i and destined for j which is routed through link (i,k).
We have

    $$\sum_{k} \phi_{ik}(j) = 1, \qquad \phi_{ik}(j) \geq 0, \quad \forall\ k, i, j. \tag{7}$$

For nodes i and j such that $\sum_m f_{im}(j) = 0$ any set of numbers
$\phi_{ik}(j)$ satisfying (7) can serve as corresponding link routing
variables. Note that routing variables of the form $\phi_{jm}(j)$ (i.e. i=j)
do not make sense and are not defined.

A set of link routing variables $\{\phi_{ik}(j)\}$, i.e. a set of numbers
satisfying (7), is said to be acyclic and destination oriented (ADO for
short) if the following condition holds:

There is no destination node j and directed cycle $(i,k_1), (k_1,k_2),
\ldots, (k_m,i)$ not containing j along which the routing variables
$\phi_{ik_1}(j), \phi_{k_1k_2}(j), \ldots, \phi_{k_mi}(j)$ are all positive.

A little thought shows that this condition implies that given any
pair of nodes i and j there exists a directed path $\{(i,k_1), (k_1,k_2),
\ldots, (k_m,j)\}$ from i to j along which the routing variables
$\phi_{ik_1}(j), \phi_{k_1k_2}(j), \ldots, \phi_{k_mj}(j)$ are positive. It
should be clear that in data networks we are primarily interested in routing
variable sets that are ADO, for otherwise data would be allowed to travel on
a loop with an obvious inefficiency resulting.*
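
The ADO condition can be tested mechanically: for each destination j, the
links whose routing variable for j is positive must form a graph without
directed cycles. The sketch below assumes the routing variables are given as
a dictionary keyed by (i, k, j); that layout and the function names are
illustrative choices, not part of the notes. Since variables with i = j are
not defined, no link leaves j in the graph for destination j, so any cycle
found automatically avoids j.

```python
def _has_cycle(succ):
    """Depth-first search for a directed cycle in a successor map."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {}

    def visit(u):
        color[u] = GRAY
        for v in succ.get(u, []):
            state = color.get(v, WHITE)
            if state == GRAY or (state == WHITE and visit(v)):
                return True          # back edge found: a cycle exists
        color[u] = BLACK
        return False

    return any(color.get(u, WHITE) == WHITE and visit(u) for u in list(succ))


def is_ado(phi):
    """phi maps (i, k, j) to the routing variable of link (i,k) for destination j."""
    for j in {dest for (_, _, dest) in phi}:
        succ = {}
        for (i, k, dest), value in phi.items():
            if dest == j and i != j and value > 0:
                succ.setdefault(i, []).append(k)
        if _has_cycle(succ):
            return False
    return True


# Hypothetical four-node examples, all traffic destined for node 4.
phi_ok = {(1, 2, 4): 0.5, (1, 3, 4): 0.5, (2, 3, 4): 0.3,
          (2, 4, 4): 0.7, (3, 4, 4): 1.0}
phi_loop = dict(phi_ok)
phi_loop[(3, 4, 4)] = 0.8
phi_loop[(3, 1, 4)] = 0.2            # creates the loop 1 -> 3 -> 1

print(is_ado(phi_ok), is_ado(phi_loop))
```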

Another easily seen fact is that a set of external traffic inputs
$\{r_i(j)\}$ and a set of ADO routing variables $\{\phi_{ik}(j)\}$ define
uniquely a corresponding set of flows $\{f_{ik}(j)\}$ via equations (5) and
(6). Furthermore if r represents the vector of traffic inputs and $\phi$ the
vector of ADO routing variables, the corresponding vector f of flows can be
defined in terms of some function $f(\phi,r)$ which depends only on the
topology of the network.

For example, in the network shown in the figure (links (i,k) with
$\phi_{ik}(4) = 0$ are not shown) we have

[Figure: example network with destination node 4 and external inputs
$r_1(4)$, $r_2(4)$, $r_3(4)$.]

    $$f_{12}(4) = r_1(4)\,\phi_{12}(4)$$

    $$f_{13}(4) = r_1(4)\,\phi_{13}(4)$$

    $$f_{23}(4) = [r_1(4)\phi_{12}(4) + r_2(4)]\,\phi_{23}(4)$$

    $$f_{24}(4) = [r_1(4)\phi_{12}(4) + r_2(4)]\,\phi_{24}(4)$$

    $$f_{34}(4) = [r_1(4)\phi_{12}(4) + r_2(4)]\,\phi_{23}(4) + r_1(4)\phi_{13}(4) + r_3(4)$$

* We assume here implicitly that the objective function is a nondecreasing
function of each total link flow.



Clearly the form of the function $f(\phi,r)$ can be quite complicated and
nonlinear but this fact does not cause significant algorithmic difficulties.
For example it is easy to construct an algorithm which for given $\phi$ and
r generates the corresponding flow vector $f(\phi,r)$, and the corresponding
set of total flows $\{F_{ik}(\phi,r)\}$. We can then pose the optimal routing
problem of finding a set of ADO routing variables which for a fixed and
given set of inputs $\{r_i(j)\}$ minimizes an objective function of the
total flows such as (1) or (3).
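
One way to construct such an algorithm, sketched here under the assumption
that the routing variables for a single destination are ADO, is to visit the
nodes in an order in which every node is processed only after all of its
upstream neighbors, and to apply (5) and (6) at each node. The function name
and data layout are hypothetical; the sample numbers are chosen to match the
structure of the four-node example above.

```python
from collections import defaultdict, deque


def flows_for_destination(phi, r):
    """Compute f_ik(j) for one fixed destination j.

    phi: {(i, k): phi_ik(j)} for that destination (no entries with i == j)
    r:   {i: r_i(j)} external inputs destined for j
    """
    succ = defaultdict(list)
    indeg = defaultdict(int)
    nodes = set(r)
    for (i, k), value in phi.items():
        nodes.update((i, k))
        if value > 0:
            succ[i].append(k)
            indeg[k] += 1

    arriving = {i: r.get(i, 0.0) for i in nodes}   # left side of (5)
    ready = deque(i for i in nodes if indeg[i] == 0)
    flows, processed = {}, 0
    while ready:
        i = ready.popleft()
        processed += 1
        for k in succ[i]:
            flows[(i, k)] = arriving[i] * phi[(i, k)]   # definition (6)
            arriving[k] += flows[(i, k)]
            indeg[k] -= 1
            if indeg[k] == 0:
                ready.append(k)
    if processed != len(nodes):
        raise ValueError("routing variables contain a directed cycle")
    return flows


# Four-node example with destination 4 (illustrative numbers).
phi = {(1, 2): 0.4, (1, 3): 0.6, (2, 3): 0.5, (2, 4): 0.5, (3, 4): 1.0}
r = {1: 10.0, 2: 2.0, 3: 1.0}
f = flows_for_destination(phi, r)
print(f[(3, 4)])   # [r1*phi12 + r2]*phi23 + r1*phi13 + r3
```

Total flows $F_{ik}$ then follow by summing the results over all
destinations, as in (4).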


An alternate but equivalent formulation of the optimal routing
problem is obtained by considering path routing variables in place of link
routing variables. For each pair w = (i,j) of distinct nodes i and j [also
called an origin-destination (or OD) pair], denote by $P_w$ the set of all
simple directed paths from i to j. For each OD pair w = (i,j) the input
$r_i(j)$, also written $r_w$, is to be divided into individual path flows
$h_p$, where $p \in P_w$, satisfying

    $$\sum_{p \in P_w} h_p = r_w, \qquad h_p \geq 0, \quad \forall\ p \in P_w. \tag{8}$$

Given a set of path flows satisfying (8) the corresponding path
routing variables for OD pairs w with $r_w > 0$ are defined by

    $$\xi_p = \frac{h_p}{r_w} \tag{9}$$

and simply represent the fractions of input routed along the corresponding
paths. It follows that path routing variables satisfy

    $$\sum_{p \in P_w} \xi_p = 1, \qquad \xi_p \geq 0, \quad \forall\ p \in P_w. \tag{10}$$

Clearly a set of path routing variables together with a set of inputs
$\{r_w\}$ defines uniquely a corresponding set of path flows via (9). These
path flows in turn define uniquely a corresponding set of link flows
obtained by adding, for each link and destination, the path flows that
traverse the link and correspond to that destination.

A conclusion is that an optimal routing problem can be posed whereby,
for a fixed and given set of OD pair inputs $\{r_w\}$, we wish to find a
set of path routing variables which minimizes an objective function of
total flows.

It is important to realize that the two formulations of the routing
problem in terms of path routing variables and ADO link routing variables
are equivalent. The reason is that given a set of inputs $\{r_w\}$ and a set
of path routing variables $\{\xi_p\}$ there exists a set of ADO link routing
variables $\{\phi_{ik}(j)\}$ with the property that $\{\xi_p\}$ and
$\{\phi_{ik}(j)\}$ generate identical sets of link flows. The set
$\{\phi_{ik}(j)\}$ is unique except for nodes i and destinations j for which
the total flow $\sum_m f_{im}(j)$ is zero.

The reverse is not entirely true. Given a set of ADO link routing
variables there is at least one but possibly more than one corresponding
set of path routing variables that generates an identical set of link flows.
Proving these facts is a simple and instructive exercise for the reader.
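
The passage from path routing variables to link routing variables can be
illustrated numerically. The sketch below assumes paths are written as tuples
of node numbers and OD pairs as (origin, destination) tuples; these
conventions and the function name are hypothetical. It forms the path flows
from the fractions and inputs, accumulates the per-destination link flows,
and then applies definition (6).

```python
from collections import defaultdict


def link_variables_from_paths(xi, r):
    """Return (f, phi) keyed by (i, k, j).

    xi: {(origin, dest): {path: fraction}}, each path a tuple of nodes
    r:  {(origin, dest): input rate r_w}
    """
    f = defaultdict(float)
    for w, fractions in xi.items():
        j = w[1]
        for path, fraction in fractions.items():
            h = fraction * r[w]                  # path flow h_p = xi_p * r_w
            for i, k in zip(path, path[1:]):
                f[(i, k, j)] += h                # add to f_ik(j)

    outgoing = defaultdict(float)                # sum over m of f_im(j)
    for (i, k, j), value in f.items():
        outgoing[(i, j)] += value

    phi = {(i, k, j): value / outgoing[(i, j)]   # definition (6)
           for (i, k, j), value in f.items() if outgoing[(i, j)] > 0}
    return dict(f), phi


# Two OD pairs sharing destination 4 (illustrative numbers).
xi = {(1, 4): {(1, 2, 4): 0.5, (1, 3, 4): 0.5},
      (2, 4): {(2, 4): 0.75, (2, 3, 4): 0.25}}
r = {(1, 4): 8.0, (2, 4): 4.0}
f, phi = link_variables_from_paths(xi, r)
print(phi[(2, 4, 4)], phi[(2, 3, 4)])
```

At node 2 the resulting link variables blend the traffic of both OD pairs,
which is why several different sets of path variables can map to the same
link variables.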

3. Implementation by Means of Routing Variables

We think of a routing algorithm as a process whereby the set of routing
variables is modified occasionally according to some rules. Before
getting into the details of various types of routing algorithms it is
worth considering briefly the practical implementation of a set of routing
variables. The chief means for doing this are the routing tables
kept at each node. At this point we must distinguish as to whether the
network uses datagrams or virtual circuits.

In a network using datagrams each message or packet (including
packets of the same pair of users) is routed independently of the others.
For the purposes of routing the only information that the message carries
is the destination ID number. Suppose we desire to implement a set of
link routing variables $\{\phi_{ik}(j)\}$ in a datagram network. One way of
doing this is for each node i to maintain a routing table whereby for each
destination j and each outgoing link (i,k) the routing variable
$\phi_{ik}(j)$ is stored together with the actual fraction
$\hat\phi_{ik}(j)$ of the number of data units (messages, bits, etc.) for
destination j actually routed along link (i,k) during the time elapsed since
the latest routing variable update. When a new message arrives node i looks
up its destination j, assigns the message to the outgoing link (i,k) for
which the ratio $\phi_{ik}(j)/\hat\phi_{ik}(j)$ is largest, and updates the
corresponding fractions $\hat\phi_{ik}(j)$. There are other possible
implementations which may differ in minor details but the idea is clear.
Traffic is metered to keep track of the actual fractions of the number of
data units travelling along each outgoing link, and the choice of route is
designed to match as closely as possible the actual fractions with the
target fractions given by the link routing variables. Each time the link
routing variables change, each node incorporates the new values in the
routing tables and reinitializes the actual fractions $\hat\phi_{ik}(j)$ to
some positive values, for example $\hat\phi_{ik}(j) = 1/m$ for all links
(i,k) with $\phi_{ik}(j) > 0$ where m is the number of these links. Note
that the link routing variables actually used for the construction of the
routing tables could themselves be obtained by first determining (using the
"master" routing algorithm) a set of path routing variables for each
OD pair and then computing the (essentially unique) corresponding set
of link routing variables.
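
A routing table of this kind can be sketched as follows. This is one possible
reading of the scheme, not the notes' definitive implementation: the class
name is hypothetical, and the actual fractions are kept as running counts
that start at one unit per usable link, which reproduces the suggested
initial values 1/m.

```python
class DatagramRoutingTable:
    """Routing table of a single node i, indexed by destination j."""

    def __init__(self, phi):
        self.update(phi)

    def update(self, phi):
        """Install new targets {j: {k: phi_ik(j)}} and reset the metering."""
        self.phi = {j: dict(row) for j, row in phi.items()}
        # One unit per link with phi > 0, so actual fractions start at 1/m.
        self.counts = {j: {k: 1.0 for k, p in row.items() if p > 0}
                       for j, row in phi.items()}

    def actual_fraction(self, j, k):
        return self.counts[j][k] / sum(self.counts[j].values())

    def route(self, j, units=1.0):
        """Pick the link with the largest target/actual ratio and meter it."""
        best = max(self.counts[j],
                   key=lambda k: self.phi[j][k] / self.actual_fraction(j, k))
        self.counts[j][best] += units
        return best


# Node with two outgoing links (to nodes 2 and 3) for destination 4.
table = DatagramRoutingTable({4: {2: 0.75, 3: 0.25}})
sent = [table.route(4) for _ in range(400)]
print(sent.count(2) / len(sent))   # close to the target fraction 0.75
```

Passing the message length as `units` would meter bits rather than
messages, in line with the remark that data units may be messages, bits, etc.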

In a network using virtual circuits all the messages belonging to
the same conversation travel along the same path during the full duration
of a conversation. [By conversation here we mean a connection between two
users (persons or machines) engaged in message exchange through the
network.] The path is set up at the beginning of the conversation when one
of the two users requests a connection with the other, much as for
ordinary telephone calls. Once a path is set up each node along the
path keeps in a table sufficient information to ensure that messages of
each conversation follow the same route. Routing variables come into play
by affecting the choice of route at the beginning of the conversation.
There are several ways that this can be done.

Suppose first that path routing variables $\{\xi_p\}$ are available for
each OD pair. Each node i keeps a count of the number of virtual circuit
calls that use each one of the paths with itself as the origin. It also
maintains the fractions $\hat\xi_p$ of the number of calls on each path p
divided by the total number of calls on paths that have the same origin and
destination as path p. When a new call request is received at node i for
some destination j, node i calculates the path p for the OD pair (i,j) for
which $\xi_p/\hat\xi_p$ is largest, assigns the call on that path, and
updates the corresponding fractions $\hat\xi_p$. The actual path is
established by sending along the path a setup packet with the sequence of
links of the path stamped on it. The fractions $\hat\xi_p$ are of course
adjusted each time a virtual circuit is terminated. As new calls are
established and old calls are terminated the values of the actual fractions
$\hat\xi_p$ drift gradually towards their desired values $\xi_p$ specified by
the routing variables even if these values happen to be substantially
different at times due to changes in $\xi_p$. It is of course also possible
to change forcibly at any time the routes of some virtual calls in order to
make the actual and desired fractions $\hat\xi_p$ and $\xi_p$ close to each
other, and this must be done each time a node or link fails thereby
disrupting some of the physical communication paths.
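
The bookkeeping at an origin node for one OD pair might look like the sketch
below. The class and method names are hypothetical, and treating a path with
no active calls but a positive target as having an infinite ratio is an
assumption made here to avoid dividing by zero; the notes do not say how that
case is handled.

```python
class PathCallTable:
    """Active virtual circuit calls of one OD pair, kept at the origin."""

    def __init__(self, xi):
        self.xi = dict(xi)                       # desired fractions xi_p
        self.calls = {p: 0 for p in xi}          # active calls per path

    def _ratio(self, p):
        if self.calls[p] == 0:                   # assumption: unused path wins
            return float("inf") if self.xi[p] > 0 else 0.0
        actual = self.calls[p] / sum(self.calls.values())
        return self.xi[p] / actual

    def setup(self):
        """Assign a new call to the path with the largest xi_p / actual ratio."""
        path = max(self.calls, key=self._ratio)
        self.calls[path] += 1
        return path

    def teardown(self, path):
        """Adjust the counts when a virtual circuit terminates."""
        if self.calls[path] == 0:
            raise ValueError("no active call on this path")
        self.calls[path] -= 1


P1, P2 = ("A", "B", "D"), ("A", "C", "D")
table = PathCallTable({P1: 0.5, P2: 0.5})
first, second, third = table.setup(), table.setup(), table.setup()
table.teardown(P1)
table.teardown(P1)
print(table.setup() == P1)   # the emptied path is refilled first
```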

Consider next the case of a virtual circuit network where we wish
to implement a set of ADO link routing variables $\{\phi_{ik}(j)\}$. Each
node i maintains the fractions $\hat\phi_{ik}(j)$ of the number of virtual
circuits passing through node i, having j as destination and routed through
link (i,k), divided by the total number of virtual circuits passing through
i and destined for j. When a new call request is received at some origin
node m with destination j, the node m sends a path finding packet along the
link (m,k) for which the ratio $\phi_{mk}(j)/\hat\phi_{mk}(j)$ is largest.
When the path finding packet reaches a new node, say i, it is subsequently
routed along the link (i,k) for which $\phi_{ik}(j)/\hat\phi_{ik}(j)$ is
largest, until it reaches the destination j. At this point the path of the
new virtual circuit call will have been established. Note that this method
of using ADO link routing variables is very similar to the one described
earlier for datagrams. Indeed we may view a datagram as a degenerate form
of virtual circuit involving a single packet transmission. If this point of
view is adopted the link routing variable based method of implementation for
virtual circuits just described reduces to the one described earlier for
datagrams.


There  are  a  number  of  variations  on  the  implementation  methods 


of  flow  rather  than  virtual  circuits  etc.  The  main  point  to  keep 

in  mind  is  that  while  the  choice  of  virtual  circuits  versus  datagrams  and 
the  corresponding  implementation  of  the  routing  strategy  are  important 
practical  design  issues,  they  are  largely  decoupled  from  the  conceptual 
issues  of  how  one  should  choose  and  update  routing  variables,  i.e.,  how 
one  should  design  the  routing  algorithm. 

4. Characterization of Optimal Routing Variables

Suppose we are given a directed network with set of nodes N and set of
links L. Let W be a collection of ordered node pairs referred to as
origin-destination (OD) pairs. For each OD pair $w \in W$ we are given a
positive number $r_w$ representing the rate of input into the network from
origin to destination measured in data units per sec. Let $P_w$ be the set
of all simple directed paths joining the OD pair w, and for each path
$p \in P_w$ let us denote by $h_p$ the flow on path p in data units per sec.
We have thus the constraint

    $$\sum_{p \in P_w} h_p = r_w, \qquad h_p \geq 0, \quad \forall\ p \in P_w,\ w \in W. \tag{1}$$


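For an OD pair $w \in W$, a path $p \in P_w$ and a link $(i,k) \in L$ we
denote

    $$\delta_p(i,k) = \begin{cases} 1 & \text{if path } p \text{ contains link } (i,k) \\ 0 & \text{otherwise} \end{cases} \tag{2}$$

Then the total flow on each link $(i,k) \in L$ is given in terms of the
individual path flows by means of the linear expression

    $$F_{ik} = \sum_{w \in W} \sum_{p \in P_w} \delta_p(i,k)\, h_p. \tag{3}$$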


In the remainder of these notes we concentrate on an objective
function of the form

    $$\sum_{(i,k) \in L} D_{ik}(F_{ik}) \tag{4}$$

and the problem of finding the set of path flows $\{h_p\}$ that minimize this
objective function subject to the constraints (1) and (3). Reference [3]
considers the problem of minimizing the maximum link utilization
$\max\{F_{ik}/C_{ik} \mid (i,k) \in L\}$ by using algorithms that bear close
relation to those used for minimizing the objective function (4).

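By eliminating the total flows $F_{ik}$ from the objective function (4)
we can write the problem as

    $$\begin{aligned}
    \text{minimize}\quad & \sum_{(i,k) \in L} D_{ik}\Big[\sum_{w \in W} \sum_{p \in P_w} \delta_p(i,k)\, h_p\Big] \\
    \text{subject to}\quad & \sum_{p \in P_w} h_p = r_w, \quad \forall\ w \in W \\
    & h_p \geq 0, \quad \forall\ p \in P_w,\ w \in W.
    \end{aligned} \tag{5}$$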


We assume that each $D_{ik}$ is a twice differentiable function of the
scalar variable $F_{ik}$ and is defined in an interval $[0, C_{ik})$ where
$C_{ik}$ is either a positive number (typically representing the capacity of
the link) or else is $+\infty$. The first and second derivatives of $D_{ik}$
are denoted $D'_{ik}$ and $D''_{ik}$ and are assumed strictly positive for
all $F_{ik} \in [0, C_{ik})$. This implies in particular that $D_{ik}$ is a
convex monotonically increasing function of $F_{ik}$.

We wish to characterize optimal solutions of problem (5) and then
derive algorithms for its solution. Note that an optimal set of path
flows $\{h^*_p\}$ yields immediately a set of optimal path routing variables
$\{\xi^*_p\}$ via the formula

    $$\xi^*_p = \frac{h^*_p}{r_w}$$

so this formulation of the routing problem is geared towards yielding
optimal path routing variables. On the other hand we have seen that
each set of path routing variables yields a corresponding set of link
routing variables, so an optimal set of path routing variables yields an
optimal set of link routing variables. An alternative is to formulate the
routing problem directly in terms of link routing variables. We refer to the
papers [1] and [2] for a presentation of possibilities along these lines.

A characterization of optimal solutions of the routing problem (5)
is obtained by specializing the following general necessary and sufficient
condition for optimality:

Lemma: Let $f: R^n \to R$ be a differentiable convex function on the
n-dimensional Euclidean space $R^n$, and let X be a convex subset of $R^n$.
Then $x^* \in X$ is an optimal solution of the problem

    $$\text{minimize } f(x) \quad \text{subject to } x \in X \tag{6}$$

if and only if

    $$\nabla f(x^*)^T (x - x^*) \geq 0, \quad \forall\ x \in X, \tag{7}$$

where $\nabla f(x^*)$ is the gradient vector of f at $x^*$ and superscript T
denotes transpose.

Proof. Assume $x^*$ is an optimal solution of (6), and for every $x \in X$
consider the function $g(\alpha) = f[x^* + \alpha(x - x^*)]$ of the scalar
variable $\alpha$. Since X is convex, $x^* + \alpha(x - x^*)$ belongs to X
for all $\alpha \in [0,1]$, so $g(\alpha)$ attains a minimum at $\alpha = 0$
over the interval [0,1] and therefore $dg(0)/d\alpha \geq 0$. But
$dg(0)/d\alpha = \nabla f(x^*)^T (x - x^*)$ (using the chain rule) so (7) is
proved.

Conversely assume that (7) holds and $x^*$ is not an optimal solution
of (6). We will arrive at a contradiction. Indeed let $x \in X$ be such that
$f(x) < f(x^*)$ and consider the function
$g(\alpha) = f[x^* + \alpha(x - x^*)]$. Then $dg(0)/d\alpha \geq 0$ [by (7)]
while $f(x^*) = g(0) > g(1) = f(x)$. These conditions contradict the
convexity of $g(\alpha)$ over [0,1], since a differentiable convex function
satisfies $g(1) \geq g(0) + dg(0)/d\alpha \geq g(0)$, and hence also the
convexity of f. Q.E.D.


We now apply the lemma to problem (5). The lemma is applicable since
both the objective function and the constraint set of (5) are convex.
If h denotes the vector of the path flows $h_p$, D(h) denotes the objective
function of problem (5) and $\partial D(h)/\partial h_p$ denotes the partial
derivative of D with respect to $h_p$ we see that

    $$\frac{\partial D(h)}{\partial h_p} = \sum_{(i,k) \in p} D'_{ik} \tag{8}$$

where the derivatives $D'_{ik}$ are evaluated at the total flows
corresponding to h. From (8) we see that $\partial D/\partial h_p$ is the
length of the path p when the length of each link (i,k) is taken to be the
first derivative $D'_{ik}$ evaluated at h. According to the lemma
$\{h^*_p\}$ is an optimal set of path flows if and only if it satisfies the
constraints of problem (5) and condition (7) is satisfied.

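By using (8), condition (7) can be written as

    $$\sum_{w \in W} \sum_{p \in P_w} d^*_p\,(h_p - h^*_p) \geq 0 \tag{9}$$

for all $h_p$ satisfying the constraints

    $$\sum_{p \in P_w} h_p = r_w, \qquad h_p \geq 0, \quad \forall\ p \in P_w,\ w \in W, \tag{10}$$

where $d^*_p$ is the first derivative length of the path p given by

    $$d^*_p = \sum_{(i,k) \in p} D'_{ik}(F^*_{ik}) \tag{11}$$

with $F^*_{ik}$ the total link flows corresponding to $\{h^*_p\}$.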

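Conditions (9) and (10) can be clearly decoupled with respect to OD
pair and written for each $w \in W$ as

    $$\sum_{p \in P_w} d^*_p\,(h_p - h^*_p) \geq 0, \quad \forall\ h_p \geq 0,\ p \in P_w \text{ with } \sum_{p \in P_w} h_p = r_w. \tag{12}$$

It is easily seen (argue by contradiction) that this condition is equivalent
to having for all $w \in W$

    $$h^*_p > 0 \ \Longrightarrow\ d^*_p = \min_{p' \in P_w} \{d^*_{p'}\}. \tag{13}$$

Equivalently we have that a set of path flows is optimal if and only if,
for each OD pair, path flow is positive only on paths of minimum first
derivative length.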
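
Condition (13) is straightforward to check numerically for a candidate set
of path flows. The sketch below assumes the cost $D_{ik} = F_{ik}/(C_{ik} -
F_{ik})$ of Section 2, so that $D'_{ik} = C_{ik}/(C_{ik} - F_{ik})^2$; the
function names, the tolerance, and the four-node data are illustrative.

```python
def link_flows(h):
    """Formula (3): total link flows from path flows h = {path: flow}."""
    F = {}
    for path, flow in h.items():
        for link in zip(path, path[1:]):
            F[link] = F.get(link, 0.0) + flow
    return F


def d_prime(F, C):
    """First derivative of D = F/(C - F)."""
    return C / (C - F) ** 2


def path_lengths(h, cap):
    """First derivative length of every path listed in h, as in (8)."""
    F = link_flows(h)
    return {p: sum(d_prime(F.get(l, 0.0), cap[l]) for l in zip(p, p[1:]))
            for p in h}


def is_optimal(h_by_od, cap, tol=1e-9):
    """Check (13): flow is positive only on minimum-length paths."""
    all_h = {p: x for paths in h_by_od.values() for p, x in paths.items()}
    d = path_lengths(all_h, cap)
    for paths in h_by_od.values():
        shortest = min(d[p] for p in paths)
        if any(x > tol and d[p] > shortest + tol for p, x in paths.items()):
            return False
    return True


A, B = (1, 2, 4), (1, 3, 4)
cap_equal = {(1, 2): 10.0, (2, 4): 10.0, (1, 3): 10.0, (3, 4): 10.0}
print(is_optimal({(1, 4): {A: 3.0, B: 3.0}}, cap_equal))   # even split
print(is_optimal({(1, 4): {A: 5.0, B: 1.0}}, cap_equal))   # lopsided split
```

A path carrying no flow may be longer than the minimum without violating
optimality, which is the case when one route has much smaller capacity.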

5. Shortest Path Routing and the Frank-Wolfe Method

We have seen that optimal routing results only if flow travels along
minimum first derivative length (MFDL for short) paths for each OD pair.
Equivalently a routing (i.e. a set of routing variables) is strictly
suboptimal only if there is a positive amount of path flow that travels on a
non-MFDL path. This suggests that suboptimal routing can be improved by
shifting flow to an MFDL path from other paths for each OD pair. Indeed
this can be shown mathematically by observing that if $h = \{h_p\}$ is a set
of feasible path flows and $\Delta h = \{\Delta h_p\}$ is a corresponding
direction for changing h then the function of the scalar $\alpha$ given by


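    $$G(\alpha) = D(h + \alpha\,\Delta h) \tag{1}$$

has first derivative

    $$\left.\frac{dG(\alpha)}{d\alpha}\right|_{\alpha=0} = \sum_{w \in W} \sum_{p \in P_w} \frac{\partial D(h)}{\partial h_p}\,\Delta h_p = \sum_{w \in W} \sum_{p \in P_w} d_p\,\Delta h_p \tag{2}$$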


where $d_p$ is the first derivative length of the path p (evaluated at the
link flows corresponding to h). Therefore if $\Delta h_p$ is positive for
MFDL paths and negative for all other paths while maintaining the
conservation of OD pair input flow equation

    $$\sum_{p \in P_w} \Delta h_p = 0, \quad \forall\ w \in W, \qquad h_p + \Delta h_p \geq 0, \quad \forall\ p \in P_w,\ w \in W \tag{3}$$

we will have

    $$\left.\frac{dG(\alpha)}{d\alpha}\right|_{\alpha=0} < 0 \tag{4}$$

which means that the objective function will be reduced by a small
motion in the direction $\Delta h$.

The preceding discussion suggests the following iterative algorithm:

Given $h = \{h_p\}$ find an MFDL path for each OD pair. Let
$\bar h = \{\bar h_p\}$ be the set of path flows that would result if all
input $r_w$ for each OD pair $w \in W$ is routed along the corresponding MFDL
path. Let $\alpha^*$ be the stepsize that minimizes
$D[h + \alpha(\bar h - h)]$ over all $\alpha \in [0,1]$, i.e.

    $$D[h + \alpha^*(\bar h - h)] = \min_{\alpha \in [0,1]} D[h + \alpha(\bar h - h)]. \tag{5}$$

The new set of path flows is obtained by

    $$h \leftarrow h + \alpha^*(\bar h - h) \tag{6}$$

and the process is repeated.

This algorithm is a special case of the so called Frank-Wolfe
method for solving general nonlinear programming problems with convex
constraint sets (see [4], [5]). It has been called the flow deviation
method (see [6]), and can be shown to reduce the value of the objective
function to its minimum in the limit although its convergence rate
near the optimum tends to be very slow. Proving convergence depends
on selecting a proper value for the stepsize $\alpha$. The determination of
an optimal stepsize $\alpha^*$ satisfying (5) requires a one-dimensional
minimization over [0,1] which can be carried out through any one of several
existing algorithms. However finding $\alpha^*$ constitutes an iterative
process which makes the algorithm impossible to implement in a distributed
manner. A simpler method is to choose the stepsize $\alpha^*$ in (6) by means
of the formula


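$$\alpha^* = \min\left[1,\ -\frac{\sum_{(i,k)} D'_{ik}\,(\bar F_{ik} - F_{ik})}{\sum_{(i,k)} D''_{ik}\,(\bar F_{ik} - F_{ik})^2}\right] \qquad (7)$$

where $\{F_{ik}\}$ and $\{\bar F_{ik}\}$ are the sets of total link flows corresponding
to $\{h_p\}$ and $\{\bar h_p\}$ respectively, and the first and second derivatives
$D'_{ik}$, $D''_{ik}$ are evaluated at $F_{ik}$. The formula (7) for $\alpha^*$ is obtained by
making a second order Taylor series expansion $\tilde G(\alpha)$ of $G(\alpha) = D[h + \alpha(\bar h - h)]$
around $\alpha = 0$,

$$\tilde G(\alpha) = \sum_{(i,k)} \left\{ D_{ik}(F_{ik}) + \alpha\, D'_{ik}(F_{ik})(\bar F_{ik} - F_{ik}) + \frac{\alpha^2}{2}\, D''_{ik}(F_{ik})(\bar F_{ik} - F_{ik})^2 \right\},$$

and minimizing $\tilde G(\alpha)$ with respect to $\alpha$ over the interval [0,1].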
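
As a rough illustration of the stepsize rule (7) and the update (6), the following Python sketch works directly with total link flows. Setting the derivative of the quadratic $\tilde G(\alpha)$ to zero gives the ratio in (7), and the cap at 1 keeps the step inside [0,1]. The function name, the dictionary representation of link flows, and the two-link test network are assumptions made for illustration; they do not appear in these notes.

```python
def flow_deviation_stepsize(F, F_bar, d1, d2):
    """Stepsize rule (7): minimizer of the second order expansion, capped at 1.

    F, F_bar: dicts mapping link -> current flow / flow with all input on MFDL paths.
    d1, d2:   functions returning D'_ik and D''_ik at a given link flow.
    """
    num = sum(d1(F[l]) * (F_bar[l] - F[l]) for l in F)
    den = sum(d2(F[l]) * (F_bar[l] - F[l]) ** 2 for l in F)
    if den == 0.0:
        return 0.0  # F_bar coincides with F (or D'' vanishes): no shift is made
    return min(1.0, -num / den)


# Hypothetical instance: two parallel links "a" and "b" joining one OD pair,
# input r = 4, and D_ik(F) = F^2 / 2, so that D' = F and D'' = 1.
d1 = lambda f: f
d2 = lambda f: 1.0
F = {"a": 4.0, "b": 0.0}        # all flow currently on link a
F_bar = {"a": 0.0, "b": 4.0}    # link b is the MFDL path (length 0 < 4)

alpha_star = flow_deviation_stepsize(F, F_bar, d1, d2)
F_new = {l: F[l] + alpha_star * (F_bar[l] - F[l]) for l in F}   # update (6)
```

For this instance the rule gives a stepsize of 0.5, and a single step equalizes the two link flows, which is the optimal split for identical quadratic link costs.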

It can be shown that the Frank-Wolfe algorithm (6) with the choice
(7) for the stepsize converges to the optimal set of total link flows
provided the starting set of total link flows is sufficiently close
to the optimal. For the type of objective functions used in routing
problems it appears that the stepsize choice (7) typically leads
to convergence even when the starting total link flows are far from
optimal.

Aside from its simplicity, the stepsize rule (7) has the advantage
that it can be implemented in a distributed way by means of a scheme
such as the one described below:

Each node $i$ broadcasts the current value of the total flow $F_{ik}$ for all of its
outgoing links $(i,k)$ to all other nodes. (This can be done by flooding
or through a spanning tree.) Each node calculates $D'_{ik}(F_{ik})$ and
$D''_{ik}(F_{ik})$ for all links $(i,k)$ and computes an MFDL path for each OD
pair $w$ for which it is the origin. It then sends the value $r_w$ along
the MFDL path. The head node of each link $(i,k)$ adds up the inputs
$r_w$ for all the MFDL paths that go through it, computes the total flow
$\bar F_{ik}$ and then broadcasts the value of $(\bar F_{ik} - F_{ik})$ to all other nodes.

All nodes then can compute the stepsize $\alpha^*$ of (7) and compute the
required change in path flows

$$h_p \leftarrow h_p + \alpha^*(\bar h_p - h_p)$$

and corresponding change in the path routing variables. The scheme
requires two messages ($F_{ik}$ and $\bar F_{ik}$) per link to be broadcast to all
nodes and one message per OD pair ($r_w$) to be sent to every node along
the corresponding MFDL path. The communication complexity per iteration
is $O(LN) + O(N^3)$ if a spanning tree is employed for broadcasting
and $O(L^2) + O(N^3)$ if flooding is employed, where $N$ and $L$ are the numbers
of nodes and links respectively. We will describe in the next section
other distributed optimal routing algorithms with better communication
complexity per iteration and a typically better rate of convergence
than the Frank-Wolfe method.

There  are  several  shortest  path  routing  algorithms  used  in 
practice  (see  [7])  that  resemble  to  some  extent  the  Frank-Wolfe  method 
although  they  fail  to  achieve  optimality  in  any  identifiable  sense  and 
in  some  cases  they  don’t  even  come  close  to  doing  so.  Their  general 
form  is  as  follows: 

(SP)  At  discrete  times  an  MFDL  path  is  computed  for  each  OD  pair  and 
all  new  traffic  (datagrams  or  virtual  circuits)  generated  in  the 
intervening  time  period  is  routed  along  these  MFDL  paths. 

The  scheme  above  presupposes  link  lengths  that  are  flow 
dependent  and  represent  first  derivatives  of  some  other  functions. 
Several  shortest  path  routing  algorithms  used  in  practice  employ  link 
lengths  that  depend  in  a  crude  (and  discontinuous)  manner  on  the  flow 
traversing the link. In some cases link lengths are taken to be constant
(which corresponds to linear functions $D_{ik}$) and change only if
the link fails (in which case its length is set to essentially $+\infty$).

The  ARPANET  algorithm  [8]  uses  as  link  length  a  time  average  of  packet 
delay  in  traversing  the  link  during  the  preceding  time  period. 
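
Every scheme of the form (SP) needs a shortest path computation with respect to the chosen link lengths. A minimal sketch using Dijkstra's algorithm is given below. It assumes nonnegative link lengths (true when each $D_{ik}$ is nondecreasing), and the function name, the data layout, and the sample flows are hypothetical choices made for illustration.

```python
import heapq


def mfdl_path(links, flows, d1, origin, dest):
    """Dijkstra's algorithm with link length D'_ik(F_ik).

    links: iterable of directed links (i, k); flows: dict link -> F_ik;
    d1: function giving the first derivative D'_ik at a flow value.
    Returns (length, list of links), or (inf, []) if dest is unreachable.
    """
    out = {}
    for (i, k) in links:
        out.setdefault(i, []).append(k)
    best = {origin: 0.0}
    prev = {}
    heap = [(0.0, origin)]
    done = set()
    while heap:
        dist, i = heapq.heappop(heap)
        if i in done:
            continue            # stale heap entry
        done.add(i)
        if i == dest:
            break
        for k in out.get(i, []):
            nd = dist + d1(flows[(i, k)])
            if nd < best.get(k, float("inf")):
                best[k] = nd
                prev[k] = i
                heapq.heappush(heap, (nd, k))
    if dest not in best:
        return float("inf"), []
    path, node = [], dest
    while node != origin:       # walk predecessors back to the origin
        path.append((prev[node], node))
        node = prev[node]
    return best[dest], path[::-1]


# Hypothetical five-node network with D'_ik(F) = F (quadratic link costs).
links = [(1, 3), (1, 4), (2, 3), (2, 4), (3, 4), (4, 5)]
flows = {(1, 3): 2.0, (1, 4): 2.0, (2, 3): 4.0, (2, 4): 4.0, (3, 4): 6.0, (4, 5): 12.0}
length_15, path_15 = mfdl_path(links, flows, lambda f: f, 1, 5)
length_25, path_25 = mfdl_path(links, flows, lambda f: f, 2, 5)
```

With constant link lengths the same routine reduces to ordinary minimum-hop or fixed-cost shortest path routing; only the function passed as `d1` changes.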

The performance of algorithm (SP) strongly depends on the choice
of the link function $D_{ik}$ and its first derivative $D'_{ik}$, on the frequency
of routing variable updates, and on the rate at which new traffic is
generated in the network. If datagrams are used exclusively in the
network, algorithm (SP) cannot possibly provide optimal or near
optimal routing. Since there is no restriction for each datagram of
a conversation to follow the same path as a previous datagram, algorithm
(SP) induces a very abrupt shift of flow when a currently used MFDL path is changed.

As  a  result,  at  any  given  time,  each  OD  pair  communicates  along  a 
single  path,  and  this  is  inconsistent  with  optimal  routing  where  it 
is  typically  necessary  to  bifurcate  flow  at  strategic  points  in  order 
to  avoid  overloading  some  portions  of  the  network  relative  to  others. 

Furthermore, shortest path routing in datagram networks can exhibit
an oscillatory behavior whereby not only do the MFDL paths change
frequently, but the algorithm also shows an unfortunate tendency
to select shortest paths that are progressively worse with
respect to any global congestion measure. An explanation and analysis
of this phenomenon is given in [9].

Algorithm  (SP)  tends  to  work  somewhat  better  in  virtual  circuit 
networks  assuming  that  whenever  an  MFDL  path  update  is  made  the  virtual 
circuits  in  use  are  not  switched  over  to  the  new  path  but  continue 
using  the  same  path  as  before.  This  in  effect  implies  a  gradual 
switch  of  traffic  from  the  old  MFDL  path  to  the  new  one  which  may  be 
viewed as an implementation of the Frank-Wolfe method. The amount of
flow shifted from the old MFDL path to the new one corresponds to the
stepsize used in the Frank-Wolfe method and basically depends on two
factors:

a)  The  rate  at  which  old  conversations  terminate  and  new  conversations 
are  generated  and 

b)  The  time  interval  between  MFDL  path  updates. 


It can be shown (as yet unpublished work) that this routing method
tends to provide a sequence of routings that converges (rather slowly)
to a neighborhood of the optimum and then oscillates within that
neighborhood. The size of the neighborhood depends on the (effective)
stepsize of the corresponding Frank-Wolfe method. As the stepsize
decreases (slower rate of generation of new conversations, and faster
MFDL path updates), the neighborhood becomes smaller.

In  conclusion  it  may  be  said  that  shortest  path  routing  bears 
some  relation  to  optimal  routing  and  the  Frank-Wolfe  method  but  it 
is  often  practiced  in  a  way  that  can  result  in  far  from  optimal 
performance.  It  makes  more  sense  in  virtual  circuit  networks  but 
even  for  such  networks  its  convergence  to  a  neighborhood  of  an  optimal 
solution  tends  to  be  slow. 


6.  Projection  Methods  for  Optimal  Routing 


Methods  in  this  category  are  also  based  on  shortest  paths  and 
determine  an  MFDL  path  for  every  OD  pair  at  each  iteration.  An 
increment  of  flow  change  is  calculated  for  each  path  on  the  basis  of 
the  relative  magnitudes  of  the  path  lengths  and,  sometimes,  second 
derivatives  of  the  objective  function.  If  some  path  flow  becomes 
negative  on  the  basis  of  the  corresponding  flow  increment  it  is 
simply  set  to  zero,  i.e.  it  is  "projected"  back  onto  the  positive 
orthant.  There  are  several  methods  of  this  type  that  are  of  interest 
in  connection  with  the  routing  problem.  They  may  all  be  viewed  as 
constrained  versions  of  common  unconstrained  optimization  methods 
such as steepest descent and Newton's method, extensive accounts of
which may be found in any text on nonlinear programming, e.g. [4], [5],
[13]. In what follows we briefly describe these methods in a general
nonlinear optimization setting and subsequently specialize them to
the routing problem.

Let $f: R^n \to R$ be a twice continuously differentiable convex
function with gradient at any $x \in R^n$ denoted $\nabla f(x)$ and Hessian matrix
$\nabla^2 f(x)$ assumed positive definite for all $x$. The method of steepest
descent for finding an unconstrained minimum of $f$ is given by the
iteration

$$x_{k+1} = x_k - \alpha_k \nabla f(x_k), \qquad k = 0, 1, \ldots \qquad (1)$$

where $\alpha_k$ is a positive scalar stepsize determined according to some
rule. Common choices for $\alpha_k$ are the minimizing stepsize determined
by

$$f[x_k - \alpha_k \nabla f(x_k)] = \min_{\alpha \ge 0} f[x_k - \alpha \nabla f(x_k)] \qquad (2)$$

and a constant positive stepsize $\alpha$,

$$\alpha_k \equiv \alpha, \qquad \forall\, k. \qquad (3)$$



There are a number of convergence results relating to method (1) with
stepsize choices (2) or (3). For example, if $f$ has a unique unconstrained
minimizing point, it may be shown that the sequence $\{x_k\}$ generated
by (1), (2) converges to this minimizing point for every starting
$x_0$. Also, given any starting vector $x_0$, the sequence generated
by (1), (3) converges to the minimizing point provided $\alpha$ is chosen
sufficiently small. Unfortunately, however, the speed of convergence
of $\{x_k\}$ can be quite slow. It can be shown [5], [13] that for the case
of the line minimization rule (2), if $f$ is a positive definite
quadratic function

$$f(x) = \tfrac{1}{2}\, x^T Q x - b^T x,$$

where $Q$ is a positive definite symmetric $n \times n$ matrix and $b$ is a given
vector, then there holds

$$\frac{f(x_{k+1}) - f^*}{f(x_k) - f^*} \le \left(\frac{M - m}{M + m}\right)^2, \qquad f^* = \min_x f(x), \qquad (4)$$

where $M$ and $m$ are the largest and smallest eigenvalues of $Q$ respectively.
Furthermore there exist starting points $x_0$ such that (4) holds
with equality for every $k$. So if the ratio $M/m$ is large (this
corresponds to the level sets of $f$ being very elongated ellipses), the
rate of convergence is slow. A similar result can be shown for the
method (1) and (3), and these results can be shown to hold in a
qualitatively similar form for general convex twice continuously
differentiable functions $f$ with everywhere positive definite Hessian
matrix.
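
The bound (4) can be checked numerically on a small quadratic. The sketch below is an illustration under stated assumptions: it takes $Q$ diagonal and $b = 0$ (so that $f^* = 0$), uses the closed-form minimizing stepsize for a quadratic, and all names are hypothetical.

```python
def steepest_descent_ratios(q, x0, iters):
    """Steepest descent with minimizing stepsize (2) on f(x) = 0.5 * sum(q_i * x_i^2).

    q: positive diagonal entries of Q (b = 0, hence f* = 0 at x = 0).
    Returns the ratios f(x_{k+1}) / f(x_k) for k = 0, ..., iters - 1.
    """
    f = lambda x: 0.5 * sum(qi * xi * xi for qi, xi in zip(q, x))
    x, ratios = list(x0), []
    for _ in range(iters):
        g = [qi * xi for qi, xi in zip(q, x)]                       # gradient Qx
        # For a quadratic the line minimization has the closed form g'g / g'Qg.
        alpha = sum(gi * gi for gi in g) / sum(qi * gi * gi for qi, gi in zip(q, g))
        x_new = [xi - alpha * gi for xi, gi in zip(x, g)]
        ratios.append(f(x_new) / f(x))
        x = x_new
    return ratios


m, M = 1.0, 10.0
bound = ((M - m) / (M + m)) ** 2
# Starting point (1/m, 1/M) makes the gradient (1, 1), an unfavorable case.
worst = steepest_descent_ratios([m, M], [1.0 / m, 1.0 / M], 5)
# Starting on an eigenvector of Q, a single step reaches the minimum.
easy = steepest_descent_ratios([m, M], [1.0, 0.0], 1)
```

With $M/m = 10$ the unfavorable starting point attains the bound $(9/11)^2 \approx 0.669$ at every iteration, whereas a start along an eigenvector converges in one step.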

The rate of convergence of the steepest descent method can be
improved by premultiplying the gradient by a suitable positive definite
scaling matrix $D_k$, thereby obtaining the iteration

$$x_{k+1} = x_k - \alpha_k D_k \nabla f(x_k), \qquad k = 0, 1, \ldots \qquad (5)$$

From the point of view of rate of convergence the best method is
obtained with the choice

$$D_k = [\nabla^2 f(x_k)]^{-1}. \qquad (6)$$

This is Newton's method, which can be shown to possess a very fast
(quadratic) speed of convergence near the minimizing point.
Unfortunately this excellent speed of convergence is achieved at the
expense of the potentially substantial overhead associated with the
inversion operation in (6). It is often useful to consider other
choices of $D_k$ which approximate the "optimal" choice $[\nabla^2 f(x_k)]^{-1}$
but do not require as much computation overhead. A choice that often
works well is to choose $D_k$ to be a diagonal approximation to the
inverse Hessian, i.e.

$$D_k = \begin{bmatrix} \left[\dfrac{\partial^2 f(x_k)}{(\partial x^1)^2}\right]^{-1} & & & 0 \\ & \left[\dfrac{\partial^2 f(x_k)}{(\partial x^2)^2}\right]^{-1} & & \\ & & \ddots & \\ 0 & & & \left[\dfrac{\partial^2 f(x_k)}{(\partial x^n)^2}\right]^{-1} \end{bmatrix} \qquad (7)$$


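With this choice the method (5) can be written in the simple form

$$x^i_{k+1} = x^i_k - \alpha_k \left[\frac{\partial^2 f(x_k)}{(\partial x^i)^2}\right]^{-1} \frac{\partial f(x_k)}{\partial x^i}, \qquad k = 0, 1, \ldots, \quad i = 1, \ldots, n. \qquad (8)$$
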

Consider now the problem of minimizing the convex twice continuously
differentiable function $f: R^n \to R$ subject to the nonnegativity
constraints $x^i \ge 0$, $i = 1, \ldots, n$, i.e. the problem

$$\text{minimize } f(x) \quad \text{subject to } x \ge 0. \qquad (9)$$

A straightforward analog of the steepest descent method (1) is given by

$$x_{k+1} = [x_k - \alpha_k \nabla f(x_k)]^+, \qquad k = 0, 1, \ldots \qquad (10)$$

where for any vector $z \in R^n$, we denote by $[z]^+$ the projection of $z$ onto
the positive orthant

$$[z]^+ = \begin{bmatrix} \max\{0, z^1\} \\ \max\{0, z^2\} \\ \vdots \\ \max\{0, z^n\} \end{bmatrix}. \qquad (11)$$



It can be shown [14] that the convergence results mentioned earlier
in connection with the unconstrained steepest descent method (1) also
hold true for the constrained analog (10). The same is true for the
method

$$x_{k+1} = [x_k - \alpha_k D_k \nabla f(x_k)]^+, \qquad k = 0, 1, \ldots \qquad (12)$$

where $D_k$ is a diagonal positive definite scaling matrix such as (7), and
for other rules of stepsize selection. While the assumption that $D_k$ is
diagonal is essential for the validity of iteration (12), there are
modified versions of (12) in which $D_k$ is chosen nondiagonal on the basis of
the second derivatives of $f$ and for which the fast convergence rate of
Newton's method is realized. We will not consider these methods in
these notes and we refer the reader to [16] and [15] for related
description and analysis, as well as application to the routing problem.

In what follows we concentrate on the application of the simple method
(12) to the routing problem for $D_k$ chosen to be a diagonal approximation
to the inverse Hessian matrix.
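
Before specializing to routing, iteration (12) with the diagonal scaling (7) can be tried on a small constrained quadratic. The following sketch is only illustrative: the test problem, the unity stepsize, and the function names are assumptions, and the matrix is chosen diagonally dominant so that the simultaneous update of all coordinates is well behaved.

```python
def projected_scaled_step(x, grad, hess_diag, alpha=1.0):
    """One pass of (12) with D_k taken as the diagonal approximation (7)."""
    g, h = grad(x), hess_diag(x)
    # Scale each gradient component by the inverse second derivative,
    # then project onto the positive orthant as in (11).
    return [max(0.0, xi - alpha * gi / hi) for xi, gi, hi in zip(x, g, h)]


# Hypothetical problem: minimize 0.5 x'Qx - b'x subject to x >= 0,
# with Q = [[2, 1], [1, 2]] and b = (1, -3).
Q = [[2.0, 1.0], [1.0, 2.0]]
b = [1.0, -3.0]
grad = lambda x: [sum(Q[i][j] * x[j] for j in range(2)) - b[i] for i in range(2)]
hess_diag = lambda x: [Q[i][i] for i in range(2)]

x = [1.0, 1.0]
for _ in range(50):
    x = projected_scaled_step(x, grad, hess_diag)
```

The unconstrained minimum of this quadratic has a negative second coordinate, so the constrained minimum sits on the boundary at $(0.5, 0)$. The iteration reaches it after a few passes, with the second coordinate held at zero by the projection.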

Consider the routing problem in terms of path flows

$$\text{minimize } \sum_{(i,k)} D_{ik}\Big[\sum_{w \in W} \sum_{p \in P_w} \delta_p(i,k)\, h_p\Big] \triangleq D(h) \qquad (13)$$

$$\text{subject to } \sum_{p \in P_w} h_p = r_w, \quad h_p \ge 0, \quad \forall\, w \in W,\ p \in P_w.$$

Assume that after $k$ iterations we have a feasible set of path flows
$\{h^k_p\}$, and let $\{F^k_{im}\}$ be the corresponding set of total link flows. For
each OD pair $w$ let $\bar p_w$ be an MFDL path [with respect to link lengths
$D'_{im}(F^k_{im})$]. We can convert problem (13) (for the purpose of the next
iteration) to a problem involving only positivity constraints by
expressing the flows of the MFDL paths $\bar p_w$ in terms of the other path
flows while eliminating the equality constraints $\sum_{p \in P_w} h_p = r_w$ in the
process. Thus we write for each $w \in W$

$$h_{\bar p_w} = r_w - \sum_{p \in P_w,\ p \ne \bar p_w} h_p \qquad (14)$$

and substitute $h_{\bar p_w}$ in the objective function $D(h)$, thereby obtaining
a problem of the form

$$\text{minimize } \tilde D(\tilde h) \quad \text{subject to } h_p \ge 0, \quad \forall\, w \in W,\ p \in P_w,\ p \ne \bar p_w \qquad (15)$$



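where $\tilde h$ is the vector of all path flows which are not MFDL paths. The
objective $\tilde D(\tilde h)$ is obtained from $D(h)$ once the MFDL path flows $h_{\bar p_w}$,
$w \in W$, are substituted by their expressions (14) in terms of the other
path flows. Clearly we have

$$\frac{\partial \tilde D(\tilde h^k)}{\partial h_p} = \frac{\partial D(h^k)}{\partial h_p} - \frac{\partial D(h^k)}{\partial h_{\bar p_w}}, \qquad \forall\, p \in P_w,\ p \ne \bar p_w \qquad (16)$$

for all $w \in W$. We have already seen in the previous section that $\partial D(h^k)/\partial h_p$ is
the first derivative length of path $p$, i.e.

$$\frac{\partial D(h^k)}{\partial h_p} = \sum_{(i,m) \in p} D'_{im}(F^k_{im}). \qquad (17)$$

Since $\bar p_w$ is an MFDL path we have from (16)

$$\frac{\partial \tilde D(\tilde h^k)}{\partial h_p} \ge 0, \qquad \forall\, w \in W,\ p \in P_w,\ p \ne \bar p_w. \qquad (18)$$

Regarding second derivatives, a straightforward differentiation
of the expression (16), (17) for the first derivative shows that

$$\frac{\partial^2 \tilde D(\tilde h^k)}{(\partial h_p)^2} = \sum_{(i,m) \in \mathcal{L}_p} D''_{im}(F^k_{im}), \qquad \forall\, w \in W,\ p \in P_w,\ p \ne \bar p_w \qquad (19)$$

where, for each $p$, $\mathcal{L}_p$ is the set of links that belong to either the path
$p$, or the corresponding MFDL path $\bar p_w$, but not both.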

We now have available expressions for both first and second
derivatives of the "reduced" objective function $\tilde D(\tilde h)$ and thus we can
apply the projection method (12) with the diagonal approximation of the
inverse Hessian as scaling matrix. The iteration takes the form

$$h^{k+1}_p = \max\{0,\ h^k_p - \alpha_k L_p^{-1}(d_p - d_{\bar p_w})\}, \qquad \forall\, w \in W,\ p \in P_w,\ p \ne \bar p_w, \qquad (20)$$

where $d_p$ and $d_{\bar p_w}$ are the first derivative lengths of the paths $p$ and
$\bar p_w$ given by [cf. (17)]

$$d_p = \sum_{(i,m) \in p} D'_{im}(F^k_{im}), \qquad d_{\bar p_w} = \sum_{(i,m) \in \bar p_w} D'_{im}(F^k_{im}) \qquad (21)$$

and $L_p$ is the "second derivative length"

$$L_p = \sum_{(i,m) \in \mathcal{L}_p} D''_{im}(F^k_{im}) \qquad (22)$$

given by (19).

The stepsize $\alpha_k$ is some positive scalar which may be chosen by a
variety of methods. For example, $\alpha_k$ could be chosen constant or by some
form of line minimization. More about stepsize selection will be said
later.

The following observations can be made regarding iteration (20):

a) Since for each OD pair $w \in W$ we have $d_p \ge d_{\bar p_w}$ for all $p \ne \bar p_w$, it
follows that all path flows $h_p$, $p \ne \bar p_w$, which are positive will be
reduced, with the corresponding increment of flow being shifted to the
MFDL path $\bar p_w$.

b) Those path flows $h_p$, $p \ne \bar p_w$, which are zero will stay at zero.
Therefore the calculation indicated in (20) should only be carried out
for paths that carry positive flow.

c) Only paths that carried positive flow at the starting flow
pattern or were MFDL paths at some previous iteration can carry positive
flow at the beginning of any single iteration. This is important in
that it tends to keep the number of paths that carry positive flow small,
with a corresponding reduction in the amount of calculation and
bookkeeping needed at each iteration.
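
The update (20)-(22) for a single OD pair can be sketched in a few lines of Python. This is a minimal sketch under assumptions: paths are given as lists of links, all second derivatives are positive, and the function name, argument layout, and sample numbers are hypothetical choices made for illustration.

```python
def od_pair_update(paths, h, F, r, alpha, d1, d2):
    """Apply iteration (20)-(22) to one OD pair.

    paths: list of paths, each a list of links (i, m); h: current path flows;
    F: dict link -> total flow; r: OD pair input; alpha: stepsize;
    d1, d2: functions giving D' and D'' at a link flow (D'' assumed positive).
    Returns the new path flows, in the same order as paths.
    """
    length = [sum(d1(F[l]) for l in p) for p in paths]        # d_p, eq. (21)
    s = min(range(len(paths)), key=lambda j: length[j])        # index of an MFDL path
    new_h = list(h)
    for j, p in enumerate(paths):
        if j == s or h[j] == 0.0:
            continue                                           # observation (b)
        diff_links = set(p) ^ set(paths[s])                    # links on exactly one path
        L_p = sum(d2(F[l]) for l in diff_links)                # eq. (22)
        new_h[j] = max(0.0, h[j] - alpha * (length[j] - length[s]) / L_p)   # eq. (20)
    # The MFDL path absorbs whatever the other paths give up, eq. (14).
    new_h[s] = r - sum(new_h[j] for j in range(len(paths)) if j != s)
    return new_h


# Illustrative data: two paths sharing the last link, D(F) = F^2 / 2 on every link.
F = {(1, 3): 2.0, (1, 4): 2.0, (3, 4): 6.0, (4, 5): 12.0}
paths = [[(1, 4), (4, 5)], [(1, 3), (3, 4), (4, 5)]]
d1 = lambda f: f
d2 = lambda f: 1.0
half_step = od_pair_update(paths, [2.0, 2.0], F, 4.0, 0.5, d1, d2)
full_step = od_pair_update(paths, [2.0, 2.0], F, 4.0, 1.0, d1, d2)
```

Here the path lengths are 14 and 20 and three links lie on exactly one of the two paths, so a stepsize of 0.5 moves one unit of flow to the shorter path and a unity stepsize moves both units.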


Regarding the choice of the stepsize $\alpha_k$ there are several
possibilities. It is possible to select $\alpha_k$ to be constant ($\alpha_k \equiv \alpha$, $\forall\, k$),
and with this choice it can be shown (the proof is essentially given in
[17]) that given any starting set of path flows there exists $\bar\alpha > 0$ such
that if for all $k$ we have $0 < \alpha_k \le \bar\alpha$ then a sequence generated by
iteration (20)-(22) converges to the optimal value of the problem. A
crucial question has to do with the magnitude of the constant stepsize.
It is known from nonlinear programming experience and analysis that a
stepsize equal to unity usually works well with Newton's method as well
as approximations to Newton's method that employ scaling based on
second derivatives [5], [13]. Experience has verified that a choice
of $\alpha_k$ in (20) near unity typically works quite well in iteration (20)
regardless of the values of the input traffic pattern $\{r_w\}$. Even
better performance with unity stepsize is usually obtained if
iteration (20) is carried out one OD pair (or one origin) at a time,
i.e. first carry out (20) with $\alpha_k = 1$ for a single OD pair (or origin),
adjust the corresponding total link flows to account for the effected change
in the path flows of this OD pair (or origin), and then carry out (20) with
$\alpha_k = 1$ for the next OD pair (or origin) until all path flows are taken up
cyclically. The rationale for this is based on the fact that by dropping the
off-diagonal terms of the Hessian matrix [cf. (5), (7)] we are in effect
neglecting the interaction between the flows of different OD pairs. In
other words, iteration (20) is based to some extent on the premise that
each OD pair will adjust its own path flows while the other OD pairs
will keep theirs unchanged. By carrying out (20) one OD pair at a time
we can reduce the potentially detrimental effect of the neglected
off-diagonal terms of the Hessian and increase the likelihood that the unity
stepsize is appropriate and effective. Under these circumstances iteration
(20) works well with a unity stepsize for almost all networks and traffic
input patterns likely to be encountered in practice.

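Another possibility, which is better suited for a centralized
implementation, is to select $\alpha_k$ by a simple form of line search in
equation (20). Thus let $\{F^k_{ik}\}$ be the set of link flows corresponding
to $\{h^k_p\}$ and let $\{\bar F^k_{ik}\}$ be the set of link flows corresponding to the
set $\{\bar h^k_p\}$ given by [cf. (20) with $\alpha_k = 1$]

$$\bar h^k_p = \max\{0,\ h^k_p - L_p^{-1}(d_p - d_{\bar p_w})\}, \qquad \forall\, w \in W,\ p \in P_w,\ p \ne \bar p_w,$$

$$\bar h^k_{\bar p_w} = r_w - \sum_{p \in P_w,\ p \ne \bar p_w} \bar h^k_p.$$

The stepsize $\alpha_k$ in (20) is chosen to minimize the 2nd order Taylor
series expansion of the objective along the line segment connecting
$\{F^k_{ik}\}$ and $\{\bar F^k_{ik}\}$, i.e. [cf. (7)]

$$\alpha_k = \min\left[1,\ -\frac{\sum_{(i,k)} D'_{ik}\,(\bar F^k_{ik} - F^k_{ik})}{\sum_{(i,k)} D''_{ik}\,(\bar F^k_{ik} - F^k_{ik})^2}\right].$$
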

The algorithm (20) described above typically yields rapid convergence
to a neighborhood of an optimal solution. Once it comes near
a solution (how "near" is "near" depends on the problem) it tends to
slow down. Its progress is often satisfactory near a solution and in
any case far better than that of the Frank-Wolfe method.

In order for one to obtain fast convergence near a solution (and
therefore also an accurate approximation to an optimal solution in a
reasonable amount of time) it is necessary to take fully into account the
off-diagonal terms of the Hessian matrix and introduce some form of line
search for finding a proper stepsize. Surprisingly it is possible to
implement sophisticated methods of this type (see [15]), although we will
not go into this further. We only mention that these more sophisticated
methods are based on a more accurate approximation of a constrained version of
Newton's method (using the conjugate gradient method) and attain the
very fast rate of convergence of Newton's method near an optimal
solution. However, when far from a solution their speed of convergence
is usually only slightly superior to that of iteration (20). So if
one is only interested in getting fast near an optimal solution but
the subsequent rate of progress is of little importance (as is typically
the case in practical routing problems), the simple iteration (20)
is usually fully satisfactory.

We now illustrate the algorithm (20)-(22) by means of an example.

Example: Consider the network shown in the figure below:

[Figure: five-node network with inputs $r_1 = 4$ and $r_2 = 8$.]

There are only two OD pairs (1,5) and (2,5) with corresponding inputs
$r_1 = 4$, $r_2 = 8$ as shown in the figure. We consider the following
two paths for each OD pair:

Paths of OD Pair (1,5):

$p_1(1) = \{(1,4), (4,5)\}$

$p_2(1) = \{(1,3), (3,4), (4,5)\}$

Paths of OD Pair (2,5):

$p_1(2) = \{(2,4), (4,5)\}$

$p_2(2) = \{(2,3), (3,4), (4,5)\}$


Consider the instance of the routing problem (13) where the link
objective functions are all identical and given by

$$D_{ik}(F_{ik}) = \tfrac{1}{2}(F_{ik})^2, \qquad \forall\, (i,k).$$


Consider an initial path flow pattern whereby each OD pair input is
divided equally between the two available paths. This results in a
flow distribution given in the following tables:

OD Pair    Path        Path Flow
(1,5)      $p_1(1)$    2
           $p_2(1)$    2
(2,5)      $p_1(2)$    4
           $p_2(2)$    4

Table 1

Link       Total Link Flow
(1,3)      2
(1,4)      2
(2,3)      4
(2,4)      4
(3,4)      6
(4,5)      12
others     0

Table 2


The first derivative length of each link is given by

$$D'_{ik}(F_{ik}) = F_{ik},$$

so the total link flows given in Table 2 are also the link lengths
for the current iteration. The corresponding first derivative lengths
of paths are given in the following table:

OD Pair    Path        First Derivative Length
(1,5)      $p_1(1)$    14
           $p_2(1)$    20
(2,5)      $p_1(2)$    16
           $p_2(2)$    22

Table 3

Therefore the shortest paths for the current iteration are $p_1(1)$
and $p_1(2)$ for OD pairs (1,5) and (2,5) respectively.

We now show the form of iteration (20)-(22) for each of the OD
pairs:

OD Pair (1,5): Here for the nonshortest path $p = p_2(1)$ and the
shortest path $\bar p_w = p_1(1)$ we have $d_p = 20$, $d_{\bar p_w} = 14$. We also have
$L_p = 3$ [each link has second derivative length $D''_{ik} = 1$ and there are
three links that belong to either $p_1(1)$ or $p_2(1)$ but not to both;
cf. (19), (22)]. Therefore iteration (20) takes the form

$$h_p \leftarrow \max\{0,\ 2 - \tfrac{\alpha_k}{3}(20 - 14)\} = \max\{0,\ 2 - 2\alpha_k\}$$

and

$$h_{\bar p_w} \leftarrow 4 - \max\{0,\ 2 - 2\alpha_k\}.$$

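OD Pair (2,5): Here for the nonshortest path $p = p_2(2)$ and the
shortest path $\bar p_w = p_1(2)$ we have $d_p = 22$ and $d_{\bar p_w} = 16$. We also have
$L_p = 3$ and iteration (20) takes the form

$$h_p \leftarrow \max\{0,\ 4 - \tfrac{\alpha_k}{3}(22 - 16)\} = \max\{0,\ 4 - 2\alpha_k\}$$

and

$$h_{\bar p_w} \leftarrow 8 - \max\{0,\ 4 - 2\alpha_k\}.$$

More generally let $h_1(1)$, $h_2(1)$, $h_1(2)$, $h_2(2)$ denote the flows
along the paths $p_1(1)$, $p_2(1)$, $p_1(2)$, $p_2(2)$, respectively, at the
beginning of the iteration. The corresponding path lengths are as
follows:

$$d_{p_1(1)} = h_1(1) + r_1 + r_2$$

$$d_{p_2(1)} = 2h_2(1) + h_2(2) + r_1 + r_2$$

$$d_{p_1(2)} = h_1(2) + r_1 + r_2$$

$$d_{p_2(2)} = 2h_2(2) + h_2(1) + r_1 + r_2$$

The second derivative length $L_p$ of (22) equals 3. The algorithm
(20)-(22) takes the following form:

OD Pair (1,5):

If $d_{p_1(1)} > d_{p_2(1)}$:

$$h_1(1) \leftarrow \max\{0,\ h_1(1) - \tfrac{\alpha_k}{3}[d_{p_1(1)} - d_{p_2(1)}]\}$$

$$h_2(1) \leftarrow r_1 - \max\{0,\ h_1(1) - \tfrac{\alpha_k}{3}[d_{p_1(1)} - d_{p_2(1)}]\}$$

If $d_{p_1(1)} \le d_{p_2(1)}$:

$$h_2(1) \leftarrow \max\{0,\ h_2(1) - \tfrac{\alpha_k}{3}[d_{p_2(1)} - d_{p_1(1)}]\}$$

$$h_1(1) \leftarrow r_1 - \max\{0,\ h_2(1) - \tfrac{\alpha_k}{3}[d_{p_2(1)} - d_{p_1(1)}]\}$$

OD Pair (2,5):

If $d_{p_1(2)} > d_{p_2(2)}$:

$$h_1(2) \leftarrow \max\{0,\ h_1(2) - \tfrac{\alpha_k}{3}[d_{p_1(2)} - d_{p_2(2)}]\}$$

$$h_2(2) \leftarrow r_2 - \max\{0,\ h_1(2) - \tfrac{\alpha_k}{3}[d_{p_1(2)} - d_{p_2(2)}]\}$$

If $d_{p_1(2)} \le d_{p_2(2)}$:

$$h_2(2) \leftarrow \max\{0,\ h_2(2) - \tfrac{\alpha_k}{3}[d_{p_2(2)} - d_{p_1(2)}]\}$$

$$h_1(2) \leftarrow r_2 - \max\{0,\ h_2(2) - \tfrac{\alpha_k}{3}[d_{p_2(2)} - d_{p_1(2)}]\}$$
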


Notice that the presence of the link (4,5) does not affect
at all the form of the iteration, and indeed that should be so since
the total flow of link (4,5) is always equal to $r_1 + r_2$ independently
of the routing.

The following table gives sequences of successive objective
function values obtained by the algorithm for different stepsize
values, and the "all OD pairs at once" and "one OD pair at a time"
modes of implementation. The network topology is the same as above
except that the inconsequential link (4,5) is deleted. The starting
point for all runs is

$$h^0_1(1) = 0, \quad h^0_2(1) = 4, \quad h^0_1(2) = 0, \quad h^0_2(2) = 8,$$

i.e. all flow is initially routed through the middle link (3,4), which
is the worst possible starting flow pattern. The stepsize is chosen
to be constant at one of three possible values ($\alpha_k \equiv 0.5$, $\alpha_k \equiv 1$,
$\alpha_k \equiv 1.8$). It can be seen that for a unity stepsize the convergence
to a neighborhood of a solution is very fast in both the one-at-a-time
and the all-at-once modes. As the stepsize is increased the
danger of divergence increases, with divergence occurring typically
first for the all-at-once mode. This can be seen from the table,
where for $\alpha_k \equiv 1.8$ the algorithm converges in the one-at-a-time
mode but diverges in the all-at-once mode.
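
The qualitative behavior described above can be regenerated with a short simulation. The sketch below is an illustration, not a reproduction of the original runs: it hard-codes the example network without link (4,5), the quadratic link costs, and the starting point given above, while the function names and the iteration count are assumptions.

```python
R = {1: 4.0, 2: 8.0}                                  # inputs r_1, r_2
PATHS = {1: [[(1, 4)], [(1, 3), (3, 4)]],             # p_1(1), p_2(1) without link (4,5)
         2: [[(2, 4)], [(2, 3), (3, 4)]]}             # p_1(2), p_2(2) without link (4,5)


def link_flows(h):
    """Total link flows induced by the path flows h[w] = [h_1(w), h_2(w)]."""
    F = {}
    for w, paths in PATHS.items():
        for p, hp in zip(paths, h[w]):
            for l in p:
                F[l] = F.get(l, 0.0) + hp
    return F


def cost(h):
    return sum(0.5 * f * f for f in link_flows(h).values())   # D_ik(F) = F^2 / 2


def update_pair(w, h, F, alpha):
    """Iteration (20)-(22) for OD pair w, given link flows F."""
    d = [sum(F[l] for l in p) for p in PATHS[w]]       # D'_ik(F) = F, eq. (21)
    s = 0 if d[0] <= d[1] else 1                       # MFDL path (ties go to path 1)
    o = 1 - s
    L_p = len(set(PATHS[w][0]) ^ set(PATHS[w][1]))     # D'' = 1 on each link, eq. (22)
    new_o = max(0.0, h[w][o] - alpha * (d[o] - d[s]) / L_p)
    out = [0.0, 0.0]
    out[o], out[s] = new_o, R[w] - new_o
    return out


def run(alpha, one_at_a_time, iters=200):
    h = {1: [0.0, 4.0], 2: [0.0, 8.0]}                 # all flow through link (3,4)
    costs = [cost(h)]
    for _ in range(iters):
        if one_at_a_time:
            for w in (1, 2):                           # link flows refreshed per OD pair
                h[w] = update_pair(w, h, link_flows(h), alpha)
        else:
            F = link_flows(h)                          # both OD pairs see the same flows
            h = {w: update_pair(w, h, F, alpha) for w in (1, 2)}
        costs.append(cost(h))
    return costs
```

For this instance the minimum cost is 29, attained at $h_2(1) = 0.5$, $h_2(2) = 2.5$. Calling `run(1.8, False)` shows the all-at-once mode settling into an oscillation between two flow patterns with costs 40 and 46.72, while `run(1.8, True)` approaches the optimal cost.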


We finally mention two possible distributed implementations of
iteration (20). One possibility is for all nodes $i$ to broadcast to
all other nodes the current total flows $F^k_{im}$ of their outgoing links
$(i,m)$. Each node then computes the MFDL paths of OD pairs for which
it is the origin and executes iteration (20) for some fixed stepsize.
This corresponds to the "all OD pairs at once" mode of implementation
and requires $O(LN)$ or $O(L^2)$ link flow transmissions depending on
whether a spanning tree or flooding is used for broadcasting.

The other possibility is for all nodes $i$ to broadcast to a
special node (say node 1) the current total flows $F^k_{im}$. This node
computes the MFDL paths of OD pairs for which it is the origin and
executes iteration (20) for a unity stepsize. It then computes the
adjusted values of the total link flows, taking into account the
results of its own iteration, and passes these values to a neighboring
node which does the same thing, until all nodes are taken up cyclically.
This corresponds to the "one at a time" mode of implementation. It
requires the same order of communication complexity as the "all at
once" mode described earlier. For both implementation modes the
communication complexity is more favorable than that of the distributed
implementation of the Frank-Wolfe method described in the previous
section.


7.  Combined  Optimal  Routing  and  Flow  Control

While routing is concerned with the choice of good routes for
messages (or other data units such as packets, virtual circuits, etc.)
that have been accepted into the network, flow control deals with the
question of whether particular messages (or other data units) should
be allowed to enter the network. It is possible to consider several
types of flow control in a data network depending on the points
between which it is exercised (see [18] for a survey). Thus
link-by-link (or hop level) flow control refers to procedures that limit
the amount of flow from the head node to the tail node of a link.
End-to-end flow control refers to procedures that limit the amount of
flow that is input from external sources at an origin node of the
communication subnetwork and is destined to another node, i.e. the
input flows $r_w$ introduced in Section 4 [cf. the routing problem (5)].

This section deals with the possibility of combining routing with end-to-end flow control by optimally adjusting both the routing variables and the inputs $r_w$. If the input $r_w$ is measured in terms of virtual circuits, then its optimal value can be viewed as a target value that the origin node strives to achieve by blocking or allowing new calls generated from external sources. Similarly, in integrated voice and data networks $r_w$ can be related to the rate of encoding of digitized voice and can be directly adjusted at the origin nodes (see [12]). When flow control is effected in terms of end-to-end windows (see [18]) there is some difficulty in determining window sizes that achieve the desired optimal inputs $r_w$. We refer the reader to [10], [11] for related discussion. In what follows we concentrate on formulating a problem of adjusting routing variables together with the inputs $r_w$ so as to minimize some "reasonable" objective function. We subsequently show that this problem is mathematically equivalent to the optimal routing problem examined in Sections 4-6 ($r_w$: fixed) and therefore the optimality conditions and algorithms given there are applicable.

If we try to minimize the objective function $\sum_{(i,k)\in L} D_{ik}(F_{ik})$ used for the routing problem [cf. (5)] with respect to both the path flows $\{h_p\}$ and the inputs $\{r_w\}$, we unhappily find that the optimal solution is to set $h_p = 0$ and $r_w = 0$ for all $p$ and $w$. This indicates that the objective function should be modified to include a penalty for the inputs $r_w$ becoming too small, and leads to the problem


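$$
\begin{aligned}
\text{minimize} \quad & \sum_{(i,k)\in L} D_{ik}\Big[\sum_{w\in W}\sum_{p\in P_w} \delta_p(i,k)\, h_p\Big] + \sum_{w\in W} e_w(r_w) \\
\text{subject to} \quad & \sum_{p\in P_w} h_p = r_w, \quad \forall\, w\in W, \\
& h_p \ge 0, \quad \forall\, p\in P_w,\ w\in W, \\
& 0 \le r_w \le \bar r_w, \quad \forall\, w\in W.
\end{aligned}
\tag{23}
$$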


Here the minimization is to be carried out jointly with respect to $\{h_p\}$ and $\{r_w\}$. The given values $\bar r_w$ represent the amount of input desired by OD pair $w$, i.e. the maximum amount of input for $w$ that would result if no flow control were exercised. The functions $e_w$ are of the form shown below

[Figure: penalty function $e_w(r_w)$ plotted against $r_w$]

and provide a penalty for throttling the inputs $r_w$. They are assumed to be convex and monotonically decreasing on the positive half line $(0,\infty)$. We assume that their first and second derivatives $e_w'$ and $e_w''$ exist on $(0,\infty)$ and are strictly negative and positive respectively. An interesting class of functions $e_w$ is specified by the following formula for their first derivative

$$
e_w'(r_w) = -\left(\frac{a_w}{r_w}\right)^{b_w}, \qquad a_w,\ b_w:\ \text{given positive constants}. \tag{24}
$$


As will be explained later in this section (see also [10]), the parameters $a_w$ and $b_w$ influence, respectively, the magnitude of the input $r_w$ and the priority (relative magnitude of input allowed under heavy load conditions) of user class $w$.

As in the routing problem, $h_p$ denotes the flow on path $p$. It is important to note, however, that some additional flexibility is provided by adopting a broader view of $w$ and considering it as a class of users sharing the same set of paths $P_w$. This allows the possibility of providing different priorities (i.e. different functions $e_w$) to different classes of users even if they share the same paths. Furthermore, it is possible to consider a problem where $P_w$ consists of a single path for each $w$, in which case the routing component of the problem is essentially eliminated ($h_p = r_w$). A problem of strictly flow control results, namely that of deciding upon the optimal fraction of the desired input flow of each user class that should be allowed into the network.

We now show that the combined routing and flow control problem (23) is mathematically equivalent to a routing problem of the type considered in Section 4 [cf. (5)]. Indeed, let us introduce a new variable $h_{\bar p_w}$ for each $w \in W$ via the equation

$$
r_w = \bar r_w - h_{\bar p_w}. \tag{25}
$$

We may view $h_{\bar p_w}$ as the amount of overflow (the portion of $\bar r_w$ blocked out of the network) and consider it as a flow on an overflow link $\bar p_w$ connecting directly the origin and destination nodes of $w$, as shown below.

[Figure: overflow link $\bar p_w$ connecting the origin and destination nodes of $w$]


If we define a new function $\bar e_w$ by

$$
\bar e_w(h_{\bar p_w}) = e_w(\bar r_w - h_{\bar p_w}), \tag{26}
$$

problem (23) becomes, in view of (25),


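$$
\begin{aligned}
\text{minimize} \quad & \sum_{(i,k)\in L} D_{ik}\Big[\sum_{w\in W}\sum_{p\in P_w} \delta_p(i,k)\, h_p\Big] + \sum_{w\in W} \bar e_w(h_{\bar p_w}) \\
\text{subject to} \quad & \sum_{p\in P_w} h_p + h_{\bar p_w} = \bar r_w, \quad \forall\, w\in W, \\
& h_p \ge 0, \quad \forall\, p\in P_w,\ w\in W, \\
& h_{\bar p_w} \ge 0, \quad \forall\, w\in W.
\end{aligned}
\tag{27}
$$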


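The form of the function $\bar e_w$ of (26) is shown below.

[Figure: $\bar e_w(h_{\bar p_w})$ plotted against the overflow $h_{\bar p_w}$, with $\bar r_w$ marked on the horizontal axis]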

If $e_w(r_w) \to \infty$ as $r_w \to 0$ (i.e. there is "infinite penalty" for completely shutting off the class of users $w$), then we have $\bar e_w(h_{\bar p_w}) \to \infty$ as the overflow $h_{\bar p_w}$ approaches its maximum value, the maximum input $\bar r_w$. So we may view $\bar e_w$ as a "delay" function for the overflow link $\bar p_w$ and consider $\bar r_w$ as the "capacity" of the link.
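As a rough numerical illustration of this analogy, the following sketch evaluates the first derivative of the overflow link function, assuming the penalty derivative has the form $e_w'(r_w) = -(a_w/r_w)^{b_w}$ of (24). The function name and the numbers are hypothetical and serve only as an example. By the chain rule, the derivative with respect to the overflow $h$ equals $(a_w/(\bar r_w - h))^{b_w}$, which grows without bound as the overflow approaches $\bar r_w$, much as the derivative of a delay function grows when the flow of a link approaches its capacity.

```python
def overflow_link_length(h, r_bar, a, b):
    """First derivative of e_bar(h) = e(r_bar - h), assuming e'(r) = -(a / r) ** b."""
    r = r_bar - h  # input actually admitted when h units are blocked
    if r <= 0:
        raise ValueError("overflow must stay below the desired input r_bar")
    return (a / r) ** b


# Hypothetical numbers: desired input 2.0, a = 1.0, b = 2.0.
overflows = (0.0, 1.0, 1.9, 1.99)
lengths = [overflow_link_length(h, r_bar=2.0, a=1.0, b=2.0) for h in overflows]
for h, length in zip(overflows, lengths):
    print(f"overflow {h:5.2f} -> first derivative length {length:10.2f}")
```

The printed values increase steeply as the overflow nears the desired input, which is the behavior one expects from a link operating close to capacity.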

It is now clear that problem (27) is of the type considered in Sections 4-6 and that the algorithms and optimality conditions given there apply. In particular, application of the optimality conditions of Section 4 yields the following result [cf. (13)]:

A feasible set of path flows $\{h_p^*\}$ and inputs $\{r_w^*\}$ is optimal for problem (23) if and only if the following conditions hold for each $w \in W$:

$$
h_p^* > 0,\ p\in P_w \;\Longrightarrow\; d_p^* = \min_{p'\in P_w}\{d_{p'}^*\}, \quad d_p^* \le -e_w'(r_w^*) \tag{28a}
$$

$$
r_w^* < \bar r_w \;\Longrightarrow\; -e_w'(r_w^*) \le \min_{p\in P_w}\{d_p^*\} \tag{28b}
$$

where $d_p^*$ is the first derivative length of path $p$ [$d_p^* = \sum_{(i,k)\in p} D_{ik}'(F_{ik}^*)$, and $F_{ik}^*$ is the total flow of link $(i,k)$ corresponding to $\{h_p^*\}$, cf. (11)].
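The conditions (28) can be turned into a small computation in the pure flow control case where every class has a single path and all classes share one link. The sketch below is only an illustration under stated assumptions: the link cost is taken to be $D(F) = F/(C - F)$ (an assumed queueing-type choice, not prescribed by this section), the penalty derivatives have the form (24), and the function name is hypothetical. With $d = D'(F)$ denoting the first derivative length of the link, the conditions reduce to $r_w = \min\{\bar r_w,\ a_w d^{-1/b_w}\}$, and $d$ must equal $D'(\sum_w r_w)$. Since the right side is nonincreasing in $d$, the consistent value can be found by bisection.

```python
import math


def solve_single_link_flow_control(C, a, b, r_bar, iters=200):
    """Return (r, d): admitted inputs and first derivative length of one shared link."""

    def inputs(d):
        # Throttled classes satisfy (a_w / r_w) ** b_w = d; others get r_bar_w.
        return [min(rb, aw * d ** (-1.0 / bw)) for aw, bw, rb in zip(a, b, r_bar)]

    def marginal_delay(F):
        # Derivative of the assumed cost D(F) = F / (C - F).
        return math.inf if F >= C else C / (C - F) ** 2

    lo, hi = 1e-12, 1e12  # bracket for d, searched on a geometric scale
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if marginal_delay(sum(inputs(mid))) > mid:
            lo = mid  # link is more congested than d suggests, so raise d
        else:
            hi = mid
    d = math.sqrt(lo * hi)
    return inputs(d), d


# Hypothetical data: capacity 1, two identical classes with large desired inputs.
r, d = solve_single_link_flow_control(C=1.0, a=[0.5, 0.5], b=[1.0, 1.0], r_bar=[10.0, 10.0])
print(r, d)
```

For these numbers both classes are throttled, so $r_w = 0.5/d$ and $d$ must satisfy $d = 1/(1 - 1/d)^2$, i.e. $d^2 - 3d + 1 = 0$ with $d > 1$, which gives a hand check on the computed value.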

The meaning of the parameters $a_w$ and $b_w$ in the objective function specified by the formula [cf. (24)]

$$
e_w'(r_w) = -\left(\frac{a_w}{r_w}\right)^{b_w} \tag{29}
$$

can now be made clear in the light of the optimality conditions (28).

Consider two distinct classes of users $w_1$ and $w_2$ sharing the same paths ($P_{w_1} = P_{w_2}$). Then the conditions (28) imply that at an optimal solution in which both classes of users are throttled ($r_{w_1}^* < \bar r_{w_1}$, $r_{w_2}^* < \bar r_{w_2}$) we have

$$
-e_{w_1}'(r_{w_1}^*) = -e_{w_2}'(r_{w_2}^*) = \min_{p\in P_{w_1}}\{d_p^*\} = \min_{p\in P_{w_2}}\{d_p^*\}. \tag{30}
$$

If $e_{w_1}'$ and $e_{w_2}'$ are specified by parameters $a_{w_1}$, $b_{w_1}$ and $a_{w_2}$, $b_{w_2}$ as in (29) we see that:




a) If $b_{w_1} = b_{w_2}$ then

$$
\frac{r_{w_1}^*}{r_{w_2}^*} = \frac{a_{w_1}}{a_{w_2}}
$$

and it follows that the parameter $a_w$ influences the optimal relative input rate of the user class $w$.

b) If $a_{w_1} = a_{w_2}$ and $b_{w_1} < b_{w_2}$ (see the figure below), then the condition (30) specifies that under heavy load conditions ($r_{w_1}^*$, $r_{w_2}^*$: small) the user class $w_2$ (the one with higher parameter $b_w$) will be allowed a larger input. It follows that the parameter $b_w$ influences the relative priority of the user class $w$ under heavy load conditions.
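Cases a) and b) can be checked numerically. The sketch below assumes that both classes are throttled, so that by (29) and (30) each input satisfies $(a_w/r_w)^{b_w} = d$, where $d$ is the common minimum first derivative length. The function name and the parameter values are hypothetical. Heavy load corresponds to large $d$ (in particular $d > 1$, so that $r_w < a_w$).

```python
def throttled_input(d, a, b):
    """Input r solving (a / r) ** b = d for a throttled user class."""
    return a * d ** (-1.0 / b)


loads = (2.0, 10.0, 100.0)  # increasing values of d, i.e. heavier load

# Case a): equal b, different a. The ratio of inputs equals a1 / a2 for every d.
ratios_a = [throttled_input(d, 2.0, 1.5) / throttled_input(d, 1.0, 1.5) for d in loads]

# Case b): equal a, b1 = 1 < b2 = 2. Ratio of class 2 input to class 1 input.
ratios_b = [throttled_input(d, 1.0, 2.0) / throttled_input(d, 1.0, 1.0) for d in loads]

print(ratios_a)
print(ratios_b)
```

In case b) the ratio equals $d^{1/b_{w_1} - 1/b_{w_2}}$, so the advantage of the class with the larger parameter $b_w$ becomes more pronounced as the load increases.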

Optimal solutions of problem (23) possess several interesting properties. We refer to [10]-[12] for a more complete discussion.

The reader may wish to verify as an exercise that the set of optimal $\{r_w^*\}$ is unique (although the set of optimal $\{h_p^*\}$ need not be unique). Furthermore, if $r_w^* < \bar r_w$, i.e. a positive amount of input for $w$ is throttled, then the optimal input $r_w^*$ will not change if $\bar r_w$ is increased. This means that the optimal input $r_w^*$ is insensitive to increased demand from the user class $w$ beyond a certain threshold.
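The insensitivity property can be seen in the simplest possible setting. The following sketch considers a single user class on a single link and assumes the link cost $D(F) = F/(C - F)$ together with $e_w'(r) = -a/r$ (the form (24) with $b_w = 1$); these choices, the function name, and the numbers are assumptions made for illustration. When the class is throttled, the optimality conditions give $a/r = C/(C - r)^2$, a quadratic equation in $r$ whose smaller root is the threshold; the optimal input is the smaller of this threshold and the desired input.

```python
import math


def optimal_input(r_bar, a=0.5, C=1.0):
    """Optimal admitted input for one class on one link, with e'(r) = -a / r
    and the assumed link cost D(F) = F / (C - F)."""
    # Throttled optimum: a / r = C / (C - r) ** 2,
    # i.e. a * r**2 - (2*a*C + C) * r + a * C**2 = 0; take the root below C.
    q = 2.0 * a * C + C
    r_throttled = (q - math.sqrt(q * q - 4.0 * a * a * C * C)) / (2.0 * a)
    return min(r_bar, r_throttled)


desired = (0.1, 0.2, 0.3, 1.0, 5.0)
admitted = [optimal_input(r_bar) for r_bar in desired]
for r_bar, r_opt in zip(desired, admitted):
    print(f"desired {r_bar:4.1f} -> admitted {r_opt:.4f}")
```

For desired inputs below the threshold everything is admitted; above it the admitted input stays at the same value no matter how much the demand grows.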